Memory-Efficient Backpropagation Through Time
Authors
Abstract
We propose a novel approach to reduce the memory consumption of the backpropagation through time (BPTT) algorithm when training recurrent neural networks (RNNs). Our approach uses dynamic programming to balance the trade-off between caching intermediate results and recomputing them. The algorithm fits tightly within almost any user-set memory budget while finding an execution policy that minimizes computational cost. Since computational devices have limited memory capacity, maximizing computational performance within a fixed memory budget is a practical use case. We provide asymptotic upper bounds on computation for various regimes. The algorithm is particularly effective for long sequences: for sequences of length 1000, it saves 95% of memory usage while using only one third more time per iteration than standard BPTT.
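The cache-versus-recompute trade-off at the heart of this idea can be illustrated with a simple fixed-stride checkpointing scheme. The sketch below is our own illustration, not the paper's dynamic-programming policy: it stores only every `stride`-th hidden state of a vanilla RNN during the forward pass and recomputes the states inside each segment while running BPTT backwards, so cached memory drops from O(T) states to roughly O(T/stride + stride) at the cost of about one extra forward pass. All names (`rnn_step`, `checkpointed_bptt`, `stride`) are hypothetical.

```python
import numpy as np

def rnn_step(h, x, W, U):
    """One step of a vanilla RNN: h_{t+1} = tanh(W h_t + U x_t)."""
    return np.tanh(W @ h + U @ x)

def full_bptt(xs, h0, W, U, dh_last):
    """Standard BPTT: caches all T hidden states (O(T) memory)."""
    hs = [h0]
    for x in xs:
        hs.append(rnn_step(hs[-1], x, W, U))
    dW, dh = np.zeros_like(W), dh_last
    for t in range(len(xs) - 1, -1, -1):
        dpre = dh * (1.0 - hs[t + 1] ** 2)  # tanh' from cached activation
        dW += np.outer(dpre, hs[t])
        dh = W.T @ dpre
    return dW

def checkpointed_bptt(xs, h0, W, U, dh_last, stride):
    """BPTT caching only every `stride`-th state; the rest are recomputed."""
    T = len(xs)
    checkpoints, h = {0: h0}, h0
    for t in range(T):                        # forward pass, sparse caching
        h = rnn_step(h, xs[t], W, U)
        if (t + 1) % stride == 0:
            checkpoints[t + 1] = h
    bounds = sorted(checkpoints)
    if bounds[-1] != T:
        bounds.append(T)                      # final (possibly short) segment
    dW, dh = np.zeros_like(W), dh_last
    for i in range(len(bounds) - 1, 0, -1):   # backward, last segment first
        s, e = bounds[i - 1], bounds[i]
        hs = [checkpoints[s]]                 # recompute states in segment
        for t in range(s, e):
            hs.append(rnn_step(hs[-1], xs[t], W, U))
        for t in range(e - 1, s - 1, -1):     # ordinary BPTT inside segment
            dpre = dh * (1.0 - hs[t - s + 1] ** 2)
            dW += np.outer(dpre, hs[t - s])
            dh = W.T @ dpre
    return dW
```

Both functions return the same gradient with respect to W; only the memory profile differs. With a fixed stride, choosing stride near the square root of T minimizes the number of cached states; the paper's contribution is to replace this fixed policy with a dynamic program that finds the cost-optimal policy for any given memory budget.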
Similar Papers
Reviving and Improving Recurrent Back-Propagation
In this paper, we revisit the recurrent backpropagation (RBP) algorithm (Almeida, 1987; Pineda, 1987), discuss the conditions under which it applies as well as how to satisfy them in deep neural networks. We show that RBP can be unstable and propose two variants based on conjugate gradient on the normal equations (CG-RBP) and Neumann series (Neumann-RBP). We further investigate the relationship...
Using CMAC for Mobile Robot Motion Control
Cerebellar Model Articulation Controller (CMAC) has some attractive features: fast learning capability and the possibility of efficient digital hardware implementation. These features make it a good choice for various control applications, like the one presented in this paper. The problem is to navigate a mobile robot (e.g., a car) from an initial state to a fixed goal state. The approach appl...
Extension of Backpropagation through Time for Segmented-Memory Recurrent Neural Networks
We introduce an extended Backpropagation Through Time (eBPTT) learning algorithm for Segmented-Memory Recurrent Neural Networks. The algorithm was compared to an extension of the Real-Time Recurrent Learning algorithm (eRTRL) for this kind of network. Using the information latching problem as a benchmark task, the algorithms' ability to cope with the learning of long-term dependencies was tested...
Long Short-Term Memory
Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, ...
Backpropagation Through Time with Fixed Memory Size Requirements
and e_i(t) is the output error, x_i(t) represents the activations, and δ_i(t) are the backpropagated errors. The system described by Eq. 1 and Eq. 2 constitutes the backpropagation through time (BPTT) algorithm. Note that the backpropagation system (Eq. 2) should be run from t=T backwards to t=1. We define the boundary conditions δ_i(T+1)=0. We will assume that the instantaneous error signal e_i(t) is ...